Algorithms for mapping short degenerate and weighted sequences to a reference genome

Authors

  • Pavlos Antoniou
  • Costas S. Iliopoulos
  • Laurent Mouchard
  • Solon P. Pissis
Abstract

Novel high-throughput (Deep) sequencing technologies have redefined the way genome sequencing is performed. They are able to produce millions of short sequences in a single experiment and with a much lower cost than previous methods. In this paper, we address the problem of efficiently mapping and classifying millions of short sequences to a reference genome, based on whether they occur exactly once in the genome or not, and by taking into consideration probability scores. In particular, we design algorithms for Massive Exact and Approximate Pattern Matching of short degenerate and weighted sequences, derived from Deep sequencing technologies, to a reference genome.
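The classification task described in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: it is a naive scan, written only to show what "occurs exactly once or not" means for degenerate reads and how a probability score can filter occurrences of a weighted read. The IUPAC table is standard; the function names, the class labels ("unique", "repeated", "unmapped"), and the `threshold` parameter are assumptions introduced for illustration.

```python
from typing import Dict, List

# Standard IUPAC nucleotide codes: each symbol stands for a set of bases.
IUPAC: Dict[str, str] = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT",
    "K": "GT", "M": "AC", "B": "CGT", "D": "AGT",
    "H": "ACT", "V": "ACG", "N": "ACGT",
}


def degenerate_occurrences(read: str, genome: str) -> List[int]:
    """Start positions where every genome base is allowed by the read symbol.

    Naive O(n * m) scan; an illustrative stand-in, not the paper's method.
    """
    allowed = [IUPAC[symbol] for symbol in read]
    m = len(read)
    return [
        i
        for i in range(len(genome) - m + 1)
        if all(genome[i + j] in allowed[j] for j in range(m))
    ]


def classify(read: str, genome: str) -> str:
    """Label a read by its number of occurrences (labels are hypothetical)."""
    hits = degenerate_occurrences(read, genome)
    if not hits:
        return "unmapped"
    return "unique" if len(hits) == 1 else "repeated"


def weighted_occurrences(
    read: List[Dict[str, float]], genome: str, threshold: float
) -> List[int]:
    """Start positions where the product of per-position probabilities
    of the genome's bases is at least `threshold` (an assumed cutoff)."""
    m = len(read)
    hits = []
    for i in range(len(genome) - m + 1):
        probability = 1.0
        for j in range(m):
            probability *= read[j].get(genome[i + j], 0.0)
            if probability < threshold:
                break  # this alignment can no longer reach the cutoff
        else:
            hits.append(i)
    return hits


if __name__ == "__main__":
    genome = "ACGTACGTTT"
    for read in ("ACGT", "GTTT", "RCGT", "GGG"):
        print(read, classify(read, genome))
    weighted_read = [{"A": 1.0}, {"C": 0.5, "G": 0.5}]
    print(weighted_occurrences(weighted_read, "ACAG", 0.4))
```

A practical mapper would replace the linear scan with an index or a bit-parallel technique; the sketch only fixes the input and output of the problem.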

Similar resources

Parallel Algorithms for Mapping Short degenerate and Weighted DNA Sequences to a Reference genome

One of the most ambitious trends in current biomedical research is the large-scale genomic sequencing of patients. Novel high-throughput (or next-generation) sequencing technologies have redefined the way genome sequencing is performed. They are able to produce millions of short sequences (reads) in a single experiment, and with a much lower cost than previously possible. Due to this massive am...

Parallel Algorithms for Degenerate and Weighted Sequences Derived from High Throughput Sequencing Technologies

Novel high throughput sequencing technologies have redefined the way genome sequencing is performed. They are able to produce millions of short sequences in a single experiment and with a much lower cost than previous methods. In this paper, we address the problem of efficiently mapping and classifying millions of degenerate and weighted sequences to a reference genome, based on whether they oc...

Clustering of Short Read Sequences for de novo Transcriptome Assembly

Given the importance of transcriptome analysis in various biological studies and considering the vast amount of whole transcriptome sequencing data, it seems necessary to develop an algorithm to assemble transcriptome data. In this study we propose an algorithm for transcriptome assembly in the absence of a reference genome. First, the contiguous sequences are generated using de Bruijn graph with d...

Designing Of Degenerate Primers-Based Polymerase Chain Reaction (PCR) For Amplification Of WD40 Repeat-Containing Proteins Using Local Allignment Search Method

Degenerate primers-based polymerase chain reaction (PCR) are commonly used for isolation of unidentified gene sequences in related organisms. For designing the degenerate primers, we propose the use of local alignment search method for searching the conserved regions long enough to design an acceptable primer pair. To test this method, a WD40 repeat-containing domain protein from Beauveria bass...

A fast and efficient algorithm for mapping short sequences to a reference genome.

Novel high-throughput (Deep) sequencing technology methods have redefined the way genome sequencing is performed. They are able to produce tens of millions of short sequences (reads) in a single experiment and with a much lower cost than previous sequencing methods. In this paper, we present a new algorithm for addressing the problem of efficiently mapping millions of short reads to a reference...

Journal title:
  • International journal of computational biology and drug design

Volume 2, Issue 4

Pages: -

Publication date: 2009